We consider the best-arm identification problem in multi-armed bandits, which focuses purely on exploration. A player is given a fixed budget to explore a finite set of arms, and the rewards of each arm are drawn independently from a fixed, unknown distribution. The player aims to identify the arm with the largest expected reward. We propose a general framework to unify sequential elimination algorithms, where the arms are dismissed iteratively until a unique arm is left. Our analysis reveals a novel performance measure expressed in terms of the sampling mechanism and the number of eliminated arms at each round. Based on this result, we develop an algorithm that divides the budget according to a nonlinear function of the remaining arms at each round. We provide theoretical guarantees for the algorithm, characterizing the suitable nonlinearity for different problem environments described by the number of competitive arms. Matching the theoretical results, our experiments show that the nonlinear algorithm outperforms the state-of-the-art. We finally study the side-observation model, where pulling an arm reveals the rewards of its related arms, and we establish improved theoretical guarantees in the pure-exploration setting.
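To make the sequential-elimination idea concrete, the following is a minimal sketch: rounds proceed until a single arm survives, each round spends a slice of the budget proportional to a nonlinear function of the number of surviving arms, samples the survivors equally, and eliminates the arm with the smallest empirical mean. The `power` knob here is a hypothetical stand-in for the paper's nonlinearity (not its actual allocation rule), and the function names are illustrative, not from the paper.

```python
import random

def nonlinear_elimination(reward_fns, budget, power=1.0, seed=0):
    """Sequential-elimination sketch.

    reward_fns: list of callables, reward_fns[i](rng) -> a reward sample for arm i.
    budget:     total number of arm pulls to spread over the rounds.
    power:      hypothetical nonlinearity; the round with m survivors gets a
                budget share proportional to m**power.
    Returns the index of the single surviving arm.
    """
    rng = random.Random(seed)
    k = len(reward_fns)
    survivors = list(range(k))
    sums, counts = [0.0] * k, [0] * k
    # One round per elimination: survivor counts go k, k-1, ..., 2.
    weights = [m ** power for m in range(k, 1, -1)]
    total_w = sum(weights)
    for m, w in zip(range(k, 1, -1), weights):
        # Split this round's budget slice equally over the m survivors.
        pulls_per_arm = max(1, int(budget * w / total_w) // m)
        for arm in survivors:
            for _ in range(pulls_per_arm):
                sums[arm] += reward_fns[arm](rng)
                counts[arm] += 1
        # Dismiss the arm with the lowest empirical mean.
        worst = min(survivors, key=lambda a: sums[a] / counts[a])
        survivors.remove(worst)
    return survivors[0]
```

With `power=1.0` the budget slice shrinks linearly with the number of remaining arms; raising or lowering `power` skews sampling toward early rounds (many arms) or late rounds (few, competitive arms), which is the kind of trade-off the abstract's nonlinearity is meant to tune.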